Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis
Abstract
This paper considers the problem of canonical correlation analysis and, more broadly, the generalized eigenvector problem for a pair of symmetric matrices. We consider the setting of finding the top-k canonical/eigen subspace, and solve these problems through a general framework that simply requires black-box access to an approximate linear system solver. Instantiating this framework with accelerated gradient descent, we obtain a running time of $O\left(\frac{z k \sqrt{\kappa}}{\rho} \log(1/\epsilon) \log(k\kappa/\rho)\right)$, where z is the total number of nonzero entries, κ is the condition number and ρ is the relative eigenvalue gap of the appropriate matrices. Our algorithm is linear in the input size and the number of components k up to a log(k) factor, which is essential for handling large-scale matrices that appear in practice. To the best of our knowledge this is the first such algorithm with global linear convergence. We hope that our results prompt further research improving the practical running time for performing these important data analysis procedures on large-scale data sets.
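To make the "black-box linear system solver" idea concrete, here is a minimal sketch in Python/NumPy. It is not the paper's algorithm: it uses a plain block power iteration on $B^{-1}A$, and it swaps in ordinary conjugate gradient for the accelerated gradient descent solver the abstract mentions. The names `cg_solve` and `top_k_gen_eig`, the iteration counts, and the synthetic test matrices are all assumptions made for illustration. It also assumes A is symmetric positive semidefinite and B is symmetric positive definite.

```python
import numpy as np

def cg_solve(B, Y, iters=100, tol=1e-12):
    """Approximately solve B X = Y (B symmetric positive definite) with plain CG.

    Hypothetical stand-in for the black-box solver; each column of Y is one system.
    """
    X = np.zeros_like(Y)
    R = Y.copy()                       # residual Y - B X for the start X = 0
    P = R.copy()                       # search directions
    rs = np.sum(R * R, axis=0)         # squared residual norm per column
    for _ in range(iters):
        if np.sqrt(rs.max()) < tol:
            break
        BP = B @ P
        alpha = rs / np.maximum(np.sum(P * BP, axis=0), 1e-300)
        X += P * alpha
        R -= BP * alpha
        rs_new = np.sum(R * R, axis=0)
        P = R + P * (rs_new / np.maximum(rs, 1e-300))
        rs = rs_new
    return X

def top_k_gen_eig(A, B, k, solve=cg_solve, iters=50, seed=0):
    """Block power iteration for the top-k solutions of A v = lambda B v."""
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((A.shape[0], k))
    for _ in range(iters):
        W = solve(B, A @ V)                    # W is approximately B^{-1} A V
        C = np.linalg.cholesky(W.T @ B @ W)    # Gram matrix in the B-inner product
        V = np.linalg.solve(C, W.T).T          # V = W C^{-T}, so V^T B V = I
    # Rayleigh-Ritz step on the k-dimensional subspace spanned by V.
    H = V.T @ A @ V
    vals, Q = np.linalg.eigh((H + H.T) / 2)
    return vals[::-1], V @ Q[:, ::-1]          # largest eigenvalue first

# Synthetic pair (A, B) whose generalized eigenvalues are known by construction.
rng = np.random.default_rng(1)
n, k = 20, 2
M = rng.standard_normal((n, n))
B = M @ M.T / n + np.eye(n)                    # symmetric positive definite
L = np.linalg.cholesky(B)
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.concatenate(([10.0, 8.0], np.linspace(1.0, 0.1, n - 2)))
A = L @ (Q0 * lam) @ Q0.T @ L.T                # generalized eigenvalues are lam
vals, V = top_k_gen_eig(A, B, k)
```

Because the solver is passed in as the `solve` argument, any routine that approximately solves $BX = Y$ can replace `cg_solve` without touching the outer loop, which is the design point the abstract emphasizes.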
Similar resources
Heuristic and exact algorithms for Generalized Bin Covering Problem
In this paper, we study the Generalized Bin Covering problem. For this problem, an exact algorithm is introduced which can find optimal solutions for small-scale instances. To find near-optimal solutions for large-scale instances, a heuristic algorithm is proposed. The efficiency of the heuristic algorithm is assessed through computational experiments.
Efficient Algorithms for Large-scale Generalized Eigenvector Computation and CCA
Moreover, it is well known that any method which starts at m and iteratively applies M to linear combinations of the points computed so far must apply M at least $\Omega(\sqrt{\kappa(B)})$ times in order to halve the error in the standard norm for the problem (Shewchuk, 1994). Consequently, methods that solve the top-1 generalized eigenvector problem by simply applying A and B, which is the same as applying M and ta...
COMPUTATIONALLY EFFICIENT OPTIMUM DESIGN OF LARGE SCALE STEEL FRAMES
Computational cost of metaheuristic based optimum design algorithms grows excessively with structure size. This results in computational inefficiency of modern metaheuristic algorithms in tackling optimum design problems of large scale structural systems. This paper attempts to provide a computationally efficient optimization tool for optimum design of large scale steel frame structures to AISC...
Multiview LSA: Representation Learning via Generalized CCA
Multiview LSA (MVLSA) is a generalization of Latent Semantic Analysis (LSA) that supports the fusion of arbitrary views of data and relies on Generalized Canonical Correlation Analysis (GCCA). We present an algorithm for fast approximate computation of GCCA, which when coupled with methods for handling missing values, is general enough to approximate some recent algorithms for inducing vector r...
Efficient Estimation of Errors-in-Variables Models
The paper addresses the discrete-time linear process identification problem, assuming noisy input and output records are available for the parameter estimation. Efficient algorithms are derived for the simultaneous estimation of the process and noise parameters. Implementation techniques based on matrix and polynomial decompositions are given in detail, resulting in estimation algorithms with re...
Publication date: 2016